Parallelizing Support Vector Machines on Distributed Computers

Zhu, Kaihua, Wang, Hao, Bai, Hongjie, Li, Jian, Qiu, Zhihuan, Cui, Hang, Chang, Edward Y.

Neural Information Processing Systems

Support Vector Machines (SVMs) suffer from a widely recognized scalability problem in both memory use and computational time. To improve scalability, we have developed a parallel SVM algorithm (PSVM), which reduces memory use by performing a row-based, approximate matrix factorization, and which loads only essential data onto each machine to perform parallel computation. Let $n$ denote the number of training instances, $p$ the reduced matrix dimension after factorization ($p$ is significantly smaller than $n$), and $m$ the number of machines. PSVM reduces the memory requirement from $O(n^2)$ to $O(np/m)$, and improves computation time to $O(np^2/m)$. Empirical studies on up to $500$ computers show PSVM to be effective.
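The scaling claim is easy to make concrete with back-of-the-envelope arithmetic. The sketch below is not from the paper; the values of $n$, $p$, and $m$ are invented for illustration, and 8 bytes per entry assumes double-precision storage.

```python
# Illustrative arithmetic only: compares the O(n^2) kernel-matrix footprint
# of a standard SVM solver with the O(np/m) per-machine footprint after a
# row-based approximate factorization, as in PSVM's complexity claim.
# The concrete n, p, m values below are hypothetical.

def kernel_matrix_bytes(n, bytes_per_entry=8):
    """Full n x n kernel matrix that a standard solver would materialize."""
    return n * n * bytes_per_entry

def psvm_per_machine_bytes(n, p, m, bytes_per_entry=8):
    """Each of m machines holds roughly n*p/m entries of the factored matrix."""
    return n * p // m * bytes_per_entry

n, p, m = 100_000, 1_000, 100                   # hypothetical problem size
full = kernel_matrix_bytes(n)                   # 80 GB for the dense kernel
per_machine = psvm_per_machine_bytes(n, p, m)   # 8 MB per machine

print(f"dense kernel: {full / 1e9:.0f} GB")
print(f"per machine:  {per_machine / 1e6:.0f} MB")
```

At these (made-up) sizes the dense kernel matrix alone exceeds any single machine's memory, while the factored, partitioned representation fits comfortably on each worker.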


Faster machine learning on larger graphs: how NumPy and Pandas slashed memory and time in…

#artificialintelligence

This week, StellarGraph released a new version of its open source library for machine learning on graphs. One of the most exciting features of StellarGraph 1.0 is a new graph data structure -- built using NumPy and Pandas -- that results in significantly lower memory usage and faster construction times. Consider that a graph created from Reddit, with more than 200 thousand nodes and 11 million edges, required almost 7GB of memory with previous iterations of the library's graph data structure. It also took 2.5 minutes to construct. In the new StellarGraph class's minimal form, that same Reddit graph now uses approximately 174MB.



Learning Finite State Representations of Recurrent Policy Networks

Koul, Anurag, Greydanus, Sam, Fern, Alan

arXiv.org Machine Learning

Recurrent neural networks (RNNs) are an effective representation of control policies for a wide range of reinforcement and imitation learning problems. RNN policies, however, are particularly difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features. In this paper, we introduce a new technique, Quantized Bottleneck Insertion, to learn finite representations of these vectors and features. The result is a quantized representation of the RNN that can be analyzed to improve our understanding of memory use and general behavior. We present results of this approach on synthetic environments and six Atari games. The resulting finite representations are surprisingly small in some cases, using as few as 3 discrete memory states and 10 observations for a perfect Pong policy. We also show that these finite policy representations lead to improved interpretability.
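The core move is easy to sketch: snap each continuous memory activation to one of a few discrete levels, so the whole memory vector becomes a symbol over a finite alphabet. The code below is a hedged illustration of that quantization step only, not the paper's Quantized Bottleneck Network training procedure; the three-level scheme and threshold are assumptions for the example.

```python
# Hedged sketch of the quantization idea: map each continuous memory
# activation to one of three discrete levels {-1, 0, 1}, so distinct
# quantized vectors can be enumerated as states of a finite-state machine.
import math

def quantize(h, threshold=0.5):
    """3-level quantization of a continuous memory vector."""
    out = []
    for x in h:
        x = math.tanh(x)        # squash into (-1, 1), as an RNN cell would
        if x > threshold:
            out.append(1)
        elif x < -threshold:
            out.append(-1)
        else:
            out.append(0)
    return tuple(out)           # hashable: usable as a discrete state id

# Distinct quantized vectors = discrete memory states of the extracted machine.
memories = [[2.0, -3.0, 0.1], [1.5, -2.0, 0.0], [0.2, 0.1, -4.0]]
states = {quantize(h) for h in memories}
print(len(states))  # the first two vectors collapse to the same state: 2
```

Nearby continuous memories collapsing onto one discrete state is exactly what makes the extracted representations so small (e.g., the 3-state Pong policy mentioned above).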


Memory Lens: How Much Memory Does an Agent Use?

Dann, Christoph, Hofmann, Katja, Nowozin, Sebastian

arXiv.org Machine Learning

We propose a new method to study the internal memory used by reinforcement learning policies. We estimate the amount of relevant past information by estimating the mutual information between an agent's behavior histories and its current action. We perform this estimation in the passive setting; that is, we do not intervene but merely observe the natural behavior of the agent. Moreover, we provide a theoretical justification for our approach by showing that it yields an implementation-independent lower bound on the minimal memory capacity of any agent that implements the observed policy. We demonstrate our approach by estimating the memory use of DQN policies on concatenated Atari frames, finding sharply different memory use across 49 games. The study of memory as information that flows from the past to the current action opens avenues to understanding and improving successful reinforcement learning algorithms.
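For discrete histories and actions, the quantity being estimated can be sketched with a simple plug-in estimator. This is an illustration of the underlying idea rather than the paper's estimator, and the toy trajectories below are invented.

```python
# Hedged sketch of passive memory estimation: a plug-in estimate of the
# mutual information I(history; action) from observed (history, action)
# pairs, with no intervention on the agent.
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in MI estimate (in bits) from a list of (history, action) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    hist = Counter(h for h, _ in pairs)
    act = Counter(a for _, a in pairs)
    mi = 0.0
    for (h, a), c in joint.items():
        p_ha = c / n
        mi += p_ha * log2(p_ha / ((hist[h] / n) * (act[a] / n)))
    return mi

# Action fully determined by the history -> MI = H(action) = 1 bit:
# the agent must carry 1 bit of the past to act this way.
deterministic = [("left", 0), ("right", 1)] * 50
# Action independent of the history -> MI = 0 bits: no memory needed.
independent = [("left", 0), ("left", 1), ("right", 0), ("right", 1)] * 25

print(mutual_information(deterministic))  # 1.0
print(mutual_information(independent))    # 0.0
```

The estimate lower-bounds the memory any implementation of the observed policy must carry, which is the implementation-independence claim in the abstract.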
